Cyberbullying Detection based on Text-Stream Classification
نویسندگان
چکیده
Current studies on cyberbullying detection, under text classification, mainly assume that the streaming text can be fully labelled. However, the exponential growth of unlabelled data in online content makes this assumption impractical. In this paper, we propose a session-based framework for automatic detection of cyberbullying from the huge amount of unlabelled streaming text. Given that the streaming data from Social Networks arrives in large volume at the server system, we incorporate an ensemble of one-class classifiers in the session-based framework. The proposed framework addresses the real world scenario, where only a small set of positive instances are available for initial training. Our main contribution in this paper is to automatically detect cyberbullying in real world situations, where labelled data is not readily available. Our early results show that the proposed approach is reasonably effective for the automatic detection of cyberbullying on Social Networks. The experiments indicate that the ensemble learner outperforms the single window and fixed window approaches, while learning is from positive and
منابع مشابه
Using Machine Learning Algorithms for Automatic Cyber Bullying Detection in Arabic Social Media
Social media allows people interact to express their thoughts or feelings about different subjects. However, some of users may write offensive twits to other via social media which known as cyber bullying. Successful prevention depends on automatically detecting malicious messages. Automatic detection of bullying in the text of social media by analyzing the text "twits" via one of the machine l...
متن کاملAutomatic Detection of Cyberbullying in Social Media Text
While social media offer great communication opportunities, they also increase the vulnerability of young people to threatening situations online. Recent studies report that cyberbullying constitutes a growing problem among youngsters. Successful prevention depends on the adequate detection of potentially harmful messages and the information overload on the Web requires intelligent systems to i...
متن کاملContent-Driven Detection of Cyberbullying on the Instagram Social Network
We study detection of cyberbullying in photosharing networks, with an eye on developing earlywarning mechanisms for the prediction of posted images vulnerable to attacks. Given the overwhelming increase in media accompanying text in online social networks, we investigate use of posted images and captions for improved detection of bullying in response to shared content. We validate our approache...
متن کاملUsing Fuzzy LR Numbers in Bayesian Text Classifier for Classifying Persian Text Documents
Text Classification is an important research field in information retrieval and text mining. The main task in text classification is to assign text documents in predefined categories based on documents’ contents and labeled-training samples. Since word detection is a difficult and time consuming task in Persian language, Bayesian text classifier is an appropriate approach to deal with different...
متن کاملUsing Fuzzy LR Numbers in Bayesian Text Classifier for Classifying Persian Text Documents
Text Classification is an important research field in information retrieval and text mining. The main task in text classification is to assign text documents in predefined categories based on documents’ contents and labeled-training samples. Since word detection is a difficult and time consuming task in Persian language, Bayesian text classifier is an appropriate approach to deal with different...
متن کامل